Written
Corpus,
Language Type:
Monolingual
Languages:
Arabic Chinese Czech English Finnish French German Hindi Indonesian Italian Japanese Korean Polish Portuguese Russian Spanish Swedish Thai Turkish
Availability:
Freely Available
License:
CC-BY-SA
Size:
300 KByte
Production Status:
Newly created-finished
Use:
Emotion Recognition/Generation
- Paper title: How Universal are Universal Dependencies? Exploiting Syntax for Multilingual Clause-level Sentiment Detection
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hiroshi Kanayama | Parallel Sentiment | /N |
Documentation:
For 19 languages (ar,cs,de,en,es,fi,fr,hi,id,it,ja,ko,pl,pt,ru,sv,th,tr,zh)
Speech/Written
Lexicon,
Language Type:
Bilingual
Languages:
Chinese Chinese dialects
Availability:
Freely Available
License:
CC-BY-4.0
Size:
400 lexemes
Production Status:
Newly created-finished
Use:
Lexicon Creation/Annotation
- Paper title: CLDFBench: Give Your Cross-Linguistic Data a Lift
- Paper track: Infrastructural Issues/Large Projects/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Johann-Mattis List | cldf-datasets/normansinitic: Structural and lexical data for the paper by Norman (2013) on Chinese dialect classification | /N |
Documentation:
Documentation provided in English
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Arabic Chinese English
Availability:
From Data Center(s)
License:
LDC
Size:
303833 words
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Towards Few-Shot Event Mention Retrieval: An Evaluation Framework and A Siamese Network Approach
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bonan Min | ACE (Automatic Content Extraction) 2005 Corpus | /N |
Documentation:
Yes. English. Yes.
Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese English Japanese Others
Availability:
Freely Available
License:
Size:
353,055 entries
Production Status:
Newly created-finished
Use:
Spelling Correction, Grammatical Error Correction
- Paper title: GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Masato Hagiwara | GitHub Typo Corpus | /N |
Documentation:
None
Written
Evaluation Data,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
Size:
251 KByte
Production Status:
Newly created-finished
Use:
Entity linking
- Paper title: CLEEK: A Chinese Long-text Corpus for Entity Linking
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Weixin Zeng | CLEEK | /N |
Documentation:
Will be provided after acceptance
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
CreativeCommons
Size:
12.7 MByte
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
- Paper title: Development and Validation of a Corpus for Machine Humor Comprehension
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yuen-Hsien Tseng | Chinese Humor Corpus | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
CreativeCommons
Size:
6.7 MByte
Production Status:
Newly created-finished
Use:
Dialogue
- Paper title: MPDD: A Multi-Party Dialogue Dataset for Analysis of Emotions and Interpersonal Relationships
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hen-Hsen Huang | Multi-Party Dialogue Dataset | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese English German
Availability:
Freely Available
License:
OpenSource
Size:
3302268 entries
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Lifeng Han | MultiMWE corpora | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
From Owner
License:
Size:
82 MByte
Production Status:
Newly created-finished
Use:
Opinion Mining/Sentiment Analysis
- Paper title: The Design and Construction of a Chinese Sarcasm Dataset
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Xiaochang Gong | Chinese Sarcasm Dataset | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
From Owner
License:
Size:
5.3 MByte
Production Status:
Newly created-finished
Use:
Opinion Mining/Sentiment Analysis
- Paper title: Target-based Sentiment Annotation in Chinese Financial News
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chaofa Yuan | FiTSA | /N |
Documentation:
No.




